Abstract

We reduce the number of open additivity problems in quantum information theory by showing 
that four of them are equivalent. Namely, we show that the conjectures of additivity of the 
minimum output entropy of a quantum channel, additivity of the Holevo expression for the 
classical capacity of a quantum channel, additivity of the entanglement of formation, and 
strong superadditivity of the entanglement of formation, are either all true or all false. 

1 Introduction 

The study of quantum information theory has led to a number of seemingly related open 
questions that center around whether certain quantities are additive. We show that four of 
these questions are equivalent. In particular, we show that the four conjectures of 

i. additivity of the minimum entropy output of a quantum channel, 

ii. additivity of the Holevo capacity of a quantum channel, 

iii. additivity of the entanglement of formation, 

iv. strong superadditivity of the entanglement of formation, 

are either all true or all false. 

Two of the basic ingredients in our proofs are already known. The first is an observation
of Matsumoto, Shimono and Winter [12] that the Stinespring dilation theorem relates a
constrained version of the Holevo capacity formula to the entanglement of formation. The
second is the realization that the entanglement of formation (or the constrained Holevo
capacity) is a linear programming problem, and so there is also a dual linear formulation. This
formulation was first presented by Audenaert and Braunstein [1], who expressed it in the
language of convexity rather than that of linear programming. We noted this independently
[16]. These two ingredients are explained in Sections 3 and 5.

The rest of this paper is organized as follows. Section 2 gives some background in
quantum information theory, describes the additivity questions we consider, and gives brief
histories of them. Sections 3 and 5 explain the two ingredients we describe above, and
are positioned immediately before the first sections in which they are used. To show that
the conditions (i) to (iv) are equivalent, in Section 4 we prove that (ii) → (iii): additivity
of the Holevo capacity implies additivity of entanglement of formation. In Section 6 we
prove (iii) → (iv): additivity of entanglement of formation implies strong superadditivity of
entanglement of formation. This implication was independently discovered by Pomeransky
[13]. In Section 7 we prove that (i) → (iii): additivity of minimum entropy output implies
additivity of entanglement of formation. In Section 8 we give simple proofs showing that
(iv) → (i), (iv) → (ii), and (iv) → (iii). The first implication is the only one that was not in
the literature, and we assume this is mainly because nobody had tried to prove it. The second
of these implications was already known, but for completeness we give a proof. The third of
these implications is trivial.¹ In Section 9 we give proofs that (ii) → (i) and (iii) → (i): either
additivity of the Holevo capacity or of the entanglement of formation implies additivity
of the minimum entropy output. These implications complete the proof of equivalence.
Strictly speaking, the only implications we need for the proof of equivalence are those in
Sections 6, 7, 8 and 9. We include the proof in Section 4 because it uses one of the techniques
used later for Section 7 without introducing the extra complexity of the dual linear programming
formulation. Finally, in the last section we comment on the implications of the results in our
paper and give some open problems.

2 Background and Results 

One of the important intellectual breakthroughs of the 20th century was the discovery and 
development of information theory. A cornerstone of this field is Shannon's proof that a 
communication channel has a well-defined information carrying capacity and his formula for 
calculating it. For communication channels that intrinsically incorporate quantum effects, 
this classical theory is no longer valid. The search for the proof of the analogous quantum 
formulae is a subarea of quantum information theory that has recently received much study. 

In the generalization of Shannon theory to the quantum realm, the definition of a stochastic
communication channel generalizes to a completely positive trace-preserving linear map
(CPT map). We call such a map a quantum channel. In this paper, we consider only
finite-dimensional CPT maps; these take $d_{in} \times d_{in}$ Hermitian matrices to
$d_{out} \times d_{out}$ Hermitian matrices. In particular, these maps take density matrices
(trace 1 positive semidefinite matrices) to density matrices. Note that the input dimension can
be different from the output dimension, and that these dimensions are both finite. Infinite
dimensional quantum channels (CPT maps) are both important and interesting, but dealing with
them also introduces extra complications that are beyond the scope of this paper.

There are several characterizations of CPT maps. We need the characterization given by
the Stinespring dilation theorem, which says that every CPT map can be described by a
unitary embedding followed by a partial trace. In particular, given a finite-dimensional CPT
map $N$, we can express it as

$$N(\rho) = \mathrm{Tr}_B \, U(\rho),$$

where $U(\rho)$ is a unitary embedding, i.e., there is some ancillary space $\mathcal{H}_B$ such
that $U$ takes $\mathcal{H}_{in}$ to $\mathcal{H}_{out} \otimes \mathcal{H}_B$ by

$$U(\rho) = V \rho V^\dagger,$$

and $V$ is a unitary matrix mapping $\mathcal{H}_{in}$ to
$\mathrm{range}(V) \subseteq \mathcal{H}_{out} \otimes \mathcal{H}_B$. We also need the
operator sum characterization of CPT maps. This characterization says that any finite-dimensional
CPT map $N$ can be represented as

$$N(\rho) = \sum_k A_k \rho A_k^\dagger,$$

where the $A_k$ are complex matrices satisfying

$$\sum_k A_k^\dagger A_k = I.$$

¹ In fact, property (iv), strong superadditivity of $E_F$, seems to be in some sense the
"strongest" of these equivalent statements, as it is fairly easy to show that strong
superadditivity of entanglement of formation implies the other three additivity results whereas
the reverse directions appear to require substantial work. Similarly, property (i) appears to be
the "weakest" of these statements.
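As a numerical illustration of how the Stinespring and operator-sum characterizations of a CPT map fit together, the following sketch stacks a set of operators $A_k$ into a single matrix $V$ and checks that tracing out the ancilla reproduces $\sum_k A_k \rho A_k^\dagger$. The amplitude-damping channel, the damping parameter `g`, the input state, and all helper names are assumptions chosen for the example; they do not appear in the paper.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def kraus_channel(kraus, rho):
    """Operator-sum form: N(rho) = sum_k A_k rho A_k^dagger."""
    d_out = len(kraus[0])
    out = [[0j] * d_out for _ in range(d_out)]
    for a in kraus:
        term = matmul(matmul(a, rho), dagger(a))
        for i in range(d_out):
            for j in range(d_out):
                out[i][j] += term[i][j]
    return out

def stinespring_isometry(kraus):
    """Stack the A_k into V with row index i * n_k + k, so that
    V|psi> = sum_k (A_k|psi>) (x) |k>_B on H_out (x) H_B."""
    n_k, d_out, d_in = len(kraus), len(kraus[0]), len(kraus[0][0])
    return [[kraus[k][i][j] for j in range(d_in)]
            for i in range(d_out) for k in range(n_k)]

def partial_trace_b(sigma, d_out, n_k):
    """Trace out the ancilla B from a matrix on H_out (x) H_B."""
    return [[sum(sigma[i * n_k + k][j * n_k + k] for k in range(n_k))
             for j in range(d_out)] for i in range(d_out)]

# Assumed example: qubit amplitude damping with damping parameter g.
g = 0.3
kraus = [[[1, 0], [0, math.sqrt(1 - g)]],
         [[0, math.sqrt(g)], [0, 0]]]
rho = [[0.25, 0.1 + 0.2j], [0.1 - 0.2j, 0.75]]

V = stinespring_isometry(kraus)
via_kraus = kraus_channel(kraus, rho)
via_dilation = partial_trace_b(matmul(matmul(V, rho), dagger(V)), 2, 2)
```

Here `V` satisfies $V^\dagger V = I$ exactly when $\sum_k A_k^\dagger A_k = I$, which is the trace-preservation condition of the operator-sum form.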



The Holevo information² $\chi$ is a quantity which is associated with a probabilistic ensemble
of quantum states (density matrices). If density matrix $\rho_i$ occurs in the ensemble with
probability $q_i$, the Holevo information $\chi$ of the ensemble is

$$\chi = H\Big(\sum_i q_i \rho_i\Big) - \sum_i q_i H(\rho_i),$$

where $H$ is the von Neumann entropy $H(\rho) = -\mathrm{Tr}\, \rho \log \rho$. This quantity was introduced in
(6| m ID as a bound for the amount of information extractable by measurements from this 
ensemble of quantum states. The first published proof of this bound was given by Holevo 
[8]. It was much later shown that maximizing the Holevo information over all probabilistic
ensembles of a set of quantum states gives the information transmission capacity of this set
of quantum states; more specifically, this is the amount of classical information which can be
transmitted asymptotically per quantum state by using codewords that are tensor products of
these quantum states, as the length of these codewords goes to infinity [9, 15]. Optimizing $\chi$
over ensembles composed of states that are potential outputs of a quantum channel gives the
capacity of this quantum channel for transmitting classical information over a restricted set of
protocols, namely those protocols which are not allowed to send inputs entangled between
different channel uses. If the channel is $N$, we call this quantity $\chi_N$; it is defined as

$$\chi_N = \max_{\{p_i, |v_i\rangle\}} H\Big(N\Big(\sum_i p_i |v_i\rangle\langle v_i|\Big)\Big) - \sum_i p_i H\big(N(|v_i\rangle\langle v_i|)\big), \qquad (1)$$

where the maximization is over ensembles $\{p_i, |v_i\rangle\}$ where $\sum_i p_i = 1$ and
$|v_i\rangle \in \mathcal{H}_{in}$, the input space of the channel $N$.
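The Holevo information of a small ensemble can be evaluated directly from its definition. The sketch below does this for ensembles of qubit states only, under the assumption that each state is given by its Bloch vector $r$, so that the state is $(I + r \cdot \sigma)/2$ with eigenvalues $(1 \pm |r|)/2$. Entropies are measured in bits, and the function names are illustrative rather than taken from the paper.

```python
import math

def binary_entropy(x):
    """H_2(x) = -x log2 x - (1 - x) log2 (1 - x), with H_2(0) = H_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def qubit_entropy(bloch):
    """von Neumann entropy (in bits) of the qubit state (I + r.sigma)/2."""
    r = min(math.sqrt(sum(c * c for c in bloch)), 1.0)
    return binary_entropy((1 + r) / 2)

def holevo_information(probs, blochs):
    """chi = H(sum_i q_i rho_i) - sum_i q_i H(rho_i) for a qubit ensemble."""
    mean = [sum(q * b[axis] for q, b in zip(probs, blochs)) for axis in range(3)]
    average_term = sum(q * qubit_entropy(b) for q, b in zip(probs, blochs))
    return qubit_entropy(mean) - average_term

# Orthogonal pure states |0>, |1> with equal probability.
chi_orthogonal = holevo_information([0.5, 0.5], [(0, 0, 1), (0, 0, -1)])

# Non-orthogonal pure states |0>, |+> with equal probability.
chi_nonorthogonal = holevo_information([0.5, 0.5], [(0, 0, 1), (1, 0, 0)])
```

The orthogonal ensemble attains one full bit, while the non-orthogonal ensemble falls strictly below one bit, as expected for states that cannot be perfectly distinguished.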

The regularized Holevo capacity is

$$\lim_{n \to \infty} \frac{1}{n} \chi_{N^{\otimes n}};$$

this gives the capacity of a quantum channel to transmit classical information when inputs
entangled between different channel uses are allowed. The question of whether the classical
capacity is given by the single-symbol Holevo capacity $\chi_N$ is the question of whether the
capacity $\chi_N$ is additive; that is, whether

$$\chi_{N_1 \otimes N_2} = \chi_{N_1} + \chi_{N_2}.$$

The $\ge$ relation is easy; the open question is the $\le$ relation.

² This has also been called the Holevo bound and the Holevo $\chi$-quantity.
The question of additivity of the minimum entropy output of a quantum channel was
originally considered independently by several people, including the author, and appears to 
have been first considered in print in 111 01 . It was originally posed as a possible first step to 
proving additivity of the Holevo capacity $\chi_N$. The question is whether

$$\min_{|\phi\rangle} H\big((N_1 \otimes N_2)(|\phi\rangle\langle\phi|)\big) = \min_{|\phi\rangle} H\big(N_1(|\phi\rangle\langle\phi|)\big) + \min_{|\phi\rangle} H\big(N_2(|\phi\rangle\langle\phi|)\big),$$

where the minimization ranges over states $|\phi\rangle$ in the input space of the channel. Note
that by the concavity of the von Neumann entropy, if we minimize over mixed states $\rho$, i.e.,
$\min_\rho H(N(\rho))$, there will always be a rank one $\rho = |\phi\rangle\langle\phi|$
achieving the minimum.

The statements (iii) and (iv) in our equivalence theorem both deal with entanglement. 
This is one of the stranger phenomena of quantum mechanics. Entanglement occurs when 
two (or more) quantum systems are non-classically correlated. The canonical example of 
this phenomenon is an EPR pair. This is the state of two quantum systems (called qubits, as 
they are each two-dimensional): 

$$\frac{1}{\sqrt{2}} \big(|01\rangle - |10\rangle\big).$$

Measurements on each of these two qubits separately can exhibit correlations which cannot
be modeled by two separated classical systems [2].

A topic in quantum information theory that has recently attracted much study is that of 
quantifying entanglement. The entanglement of a bipartite pure state is easy to define and 
compute; this is the entropy of the partial trace over one of the two parts:

$$E_{pure}(|v\rangle\langle v|) = H(\mathrm{Tr}_B |v\rangle\langle v|).$$

Asymptotically, two parties sharing $n$ copies of a bipartite pure state $|v\rangle\langle v|$
can use local quantum operations and classical communication (called LOCC operations) to produce
$n E_{pure}(|v\rangle\langle v|) - o(n)$ nearly perfect EPR pairs, and can similarly form $n$
nearly perfect copies of $|v\rangle\langle v|$ from $n E_{pure}(|v\rangle\langle v|) + o(n)$ EPR
pairs [4]. This implies that for a pure state $|v\rangle\langle v|$, the entropy of the partial
trace is the natural quantitative measure of the amount of entanglement contained in
$|v\rangle\langle v|$.
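For two qubits, the pure-state entanglement defined above can be computed in a few lines. The sketch below assumes the state is supplied as a 2x2 table of amplitudes `c[i][j]` for $|i\rangle_A |j\rangle_B$, forms $\mathrm{Tr}_B |v\rangle\langle v|$, and evaluates its entropy in bits from the closed-form eigenvalues of a 2x2 Hermitian matrix. The function names are illustrative.

```python
import math

def reduced_state_a(c):
    """Tr_B |v><v| for |v> = sum_ij c[i][j] |i>_A |j>_B (two qubits)."""
    return [[sum(c[i][j] * complex(c[k][j]).conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]

def entropy_2x2(rho):
    """von Neumann entropy (in bits) of a 2x2 density matrix."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = [(tr + disc) / 2, (tr - disc) / 2]
    return -sum(x * math.log2(x) for x in eigenvalues if x > 1e-15)

def pure_state_entanglement(c):
    """E_pure(|v><v|) = H(Tr_B |v><v|)."""
    return entropy_2x2(reduced_state_a(c))

s = 1 / math.sqrt(2)
singlet = [[0, s], [-s, 0]]      # (|01> - |10>)/sqrt(2), the EPR pair
product = [[1, 0], [0, 0]]       # |00>, an unentangled state

e_singlet = pure_state_entanglement(singlet)
e_product = pure_state_entanglement(product)
```

The EPR pair carries one unit of entanglement in these units and the product state carries none.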

For mixed states (density matrices of rank > 1), things become more complicated. The
amount of pure state entanglement asymptotically extractable from a state using LOCC
operations (the distillable entanglement) is now no longer necessarily equal to the amount of
pure state entanglement asymptotically required to create a state using LOCC operations (the
entanglement cost) [17]. In general, the entanglement cost must be at least the distillable
entanglement, as LOCC operations cannot increase the amount of entanglement.

The entanglement of formation was introduced in [5]. Suppose we have a bipartite state
$\sigma$ on a Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. The entanglement of formation is

$$E_F(\sigma) = \min_{\{p_i, |v_i\rangle\}} \sum_i p_i H(\mathrm{Tr}_B |v_i\rangle\langle v_i|), \qquad (2)$$

where the minimization is over all ensembles such that $\sum_i p_i |v_i\rangle\langle v_i| = \sigma$
with probabilities $p_i$ satisfying $\sum_i p_i = 1$. The entanglement of formation must be at
least the entanglement cost, as the decomposition of the state $\sigma$ yielding $E_F(\sigma)$
can be used to create a prescription for asymptotically constructing $\sigma^{\otimes n}$ from
$n E_F(\sigma) + o(n)$ EPR pairs. The regularized entanglement of formation

$$\lim_{n \to \infty} \frac{1}{n} E_F(\sigma^{\otimes n})$$

has been proven to give the entanglement cost of a quantum state [7]. As in the case of
channel capacity, a proof of additivity, i.e., that

$$E_F(\sigma_1 \otimes \sigma_2) = E_F(\sigma_1) + E_F(\sigma_2),$$

would imply that regularization is not necessary.

The question of strong superadditivity of entanglement of formation has been previously 
considered in Q El HI This conjecture says that for all states a over a quadripartite 
system $\mathcal{H}_{A1} \otimes \mathcal{H}_{A2} \otimes \mathcal{H}_{B1} \otimes \mathcal{H}_{B2}$, we have

$$E_F(\sigma) \ge E_F(\mathrm{Tr}_2 \sigma) + E_F(\mathrm{Tr}_1 \sigma),$$

where the entanglement of formation $E_F$ is taken over the bipartite A-B division, as in (2).
This question was originally considered in relation to the question of additivity of $E_F$. The
strong superadditivity of entanglement of formation is known to imply both the additivity
of entanglement of formation (trivially) and the additivity of Holevo capacity of a channel
[12]. A proof similar to ours that additivity of $E_F$ implies strong superadditivity of $E_F$ was
discovered independently; it appears in [13].

We can now state the main result of our paper. 

Theorem 1 The following are equivalent. 

i. The additivity of the minimum entropy output of a quantum channel. Suppose we have
two quantum channels (CPT maps) $N_1$ (taking $\mathbb{C}^{d_{1,in} \times d_{1,in}}$ to
$\mathbb{C}^{d_{1,out} \times d_{1,out}}$) and $N_2$ (taking $\mathbb{C}^{d_{2,in} \times d_{2,in}}$
to $\mathbb{C}^{d_{2,out} \times d_{2,out}}$). Then

$$\min_{|\phi\rangle} H\big((N_1 \otimes N_2)(|\phi\rangle\langle\phi|)\big) = \min_{|\phi\rangle} H\big(N_1(|\phi\rangle\langle\phi|)\big) + \min_{|\phi\rangle} H\big(N_2(|\phi\rangle\langle\phi|)\big),$$

where $H$ is the von Neumann entropy and the minimization is taken over all vectors
$|\phi\rangle$ in the input space of the channels.

ii. The additivity of the Holevo capacity of a quantum channel. Assume we have two
quantum channels $N_1$ and $N_2$, as in (i). Then

$$\chi_{N_1 \otimes N_2} = \chi_{N_1} + \chi_{N_2},$$

where $\chi$ is defined as in Eq. (1).

iii. Additivity of the entanglement of formation. Suppose we have two quantum states
$\sigma_1 \in \mathcal{H}_{A1} \otimes \mathcal{H}_{B1}$ and
$\sigma_2 \in \mathcal{H}_{A2} \otimes \mathcal{H}_{B2}$. Then

$$E_F(\sigma_1 \otimes \sigma_2) = E_F(\sigma_1) + E_F(\sigma_2),$$

where $E_F$ is defined as in Eq. (2). In particular, the entanglement of formation is
calculated over the bipartite A-B partition.

iv. The strong superadditivity of the entanglement of formation. Suppose we have a density
matrix $\sigma$ over a quadripartite system
$\mathcal{H}_{A1} \otimes \mathcal{H}_{A2} \otimes \mathcal{H}_{B1} \otimes \mathcal{H}_{B2}$. Then

$$E_F(\sigma) \ge E_F(\mathrm{Tr}_2 \sigma) + E_F(\mathrm{Tr}_1 \sigma),$$

where the entanglement of formation is calculated over the bipartite A-B partition.
Here, the operator $\mathrm{Tr}_1$ traces out the space $\mathcal{H}_{A1} \otimes \mathcal{H}_{B1}$,
and $\mathrm{Tr}_2$ traces out the space $\mathcal{H}_{A2} \otimes \mathcal{H}_{B2}$.

3 The correspondence of Matsumoto, Shimono and Winter 

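Recall the definition of the Holevo capacity for a channel $N$:

$$\chi_N = \max_{\{p_i, |\phi_i\rangle\}} H\Big(N\Big(\sum_i p_i |\phi_i\rangle\langle\phi_i|\Big)\Big) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big).$$

Recall also the definition of entanglement of formation. For a bipartite state $\sigma$ on
$\mathcal{H}_A \otimes \mathcal{H}_B$, the entanglement of formation is

$$E_F(\sigma) = \min_{\{p_i, |v_i\rangle\}} \sum_i p_i H(\mathrm{Tr}_B |v_i\rangle\langle v_i|).$$

Let us define a constrained version of the Holevo capacity, which is just the Holevo
capacity over ensembles whose average input is $\rho$:

$$\chi_N(\rho) = \max_{\{p_i, |\phi_i\rangle\} :\ \sum_i p_i |\phi_i\rangle\langle\phi_i| = \rho} H\Big(N\Big(\sum_i p_i |\phi_i\rangle\langle\phi_i|\Big)\Big) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big). \qquad (3)$$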

The paper of Matsumoto, Shimono and Winter [12] gives a connection between this
constrained version of the Holevo capacity and the entanglement of formation, which we
now explain. The Stinespring dilation theorem says that any quantum channel can be realized
as a unitary transformation followed by a partial trace. Suppose we have a channel $N$ taking
$\mathcal{H}_{in}$ to $\mathcal{H}_A$. We can find a unitary embedding
$U(\rho) = V \rho V^\dagger$ that takes $\mathcal{H}_{in}$ to $\mathcal{H}_A \otimes \mathcal{H}_B$
such that

$$N(\mu) = \mathrm{Tr}_B \, U(\mu)$$

for all density matrices $\mu \in \mathcal{H}_{in}$. Now, $U$ maps an ensemble of input states
$\{p_i, |\phi_i\rangle\}$ with $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$ to an ensemble of
states $\{p_i, |v_i\rangle = V|\phi_i\rangle\}$ on the bipartite system
$\mathcal{H}_A \otimes \mathcal{H}_B$ such that
$\sum_i p_i |v_i\rangle\langle v_i| = \sigma = U(\rho)$.

Conversely, if we are given a bipartite state $\sigma \in \mathcal{H}_A \otimes \mathcal{H}_B$,
we can find an input space $\mathcal{H}_{in}$ with $\dim \mathcal{H}_{in} = \mathrm{rank}\, \sigma$,
a density matrix $\rho \in \mathcal{H}_{in}$, and a unitary embedding
$U : \mathcal{H}_{in} \to \mathcal{H}_A \otimes \mathcal{H}_B$ such that $U(\rho) = \sigma$. We can
then define $N$ by

$$N(\mu) = \mathrm{Tr}_B \, U(\mu),$$

establishing the same relation between $N$, $U$, $\rho$ and $\sigma$. Note that since we chose
$\dim \mathcal{H}_{in} = \mathrm{rank}\, \sigma = \mathrm{rank}\, \rho$, $\rho$ has full rank in
$\mathcal{H}_{in}$.
Since $N(|\phi_i\rangle\langle\phi_i|) = \mathrm{Tr}_B |v_i\rangle\langle v_i|$, we have

$$\chi_N(\rho) = H(N(\rho)) - E_F(\sigma).$$
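The step from the correspondence between ensembles to this identity can be written out explicitly. The following is a sketch of the intermediate lines, using the fact that $|\phi_i\rangle \mapsto V|\phi_i\rangle$ matches decompositions of $\rho$ with decompositions of $\sigma = V \rho V^\dagger$, since every pure state in a decomposition of $\sigma$ lies in the range of $V$:

```latex
\begin{aligned}
\chi_N(\rho)
  &= \max_{\sum_i p_i |\phi_i\rangle\langle\phi_i| = \rho}
     \Big[ H(N(\rho)) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big) \Big] \\
  &= H(N(\rho)) - \min_{\sum_i p_i |v_i\rangle\langle v_i| = \sigma}
     \sum_i p_i H\big(\mathrm{Tr}_B |v_i\rangle\langle v_i|\big) \\
  &= H(N(\rho)) - E_F(\sigma).
\end{aligned}
```

The first term does not depend on the decomposition, so maximizing the difference is the same as minimizing the average output entropy, which is the minimization defining $E_F(\sigma)$.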

Now, suppose $E_F(\sigma)$ is additive. I claim that $\chi_N(\rho)$ is as well, and vice versa.
Let us take $N_1(\rho) = \mathrm{Tr}_B U_1(\rho)$ and $N_2(\rho) = \mathrm{Tr}_B U_2(\rho)$. If
$U_1(\rho_1) = \sigma_1$ and $U_2(\rho_2) = \sigma_2$, then we have

$$\chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2) = H\big((N_1 \otimes N_2)(\rho_1 \otimes \rho_2)\big) - E_F(\sigma_1 \otimes \sigma_2)$$
$$= H(N_1(\rho_1)) + H(N_2(\rho_2)) - E_F(\sigma_1 \otimes \sigma_2).$$

The first term on the right-hand side is additive, so the entanglement of formation $E_F$ is
additive if and only if the constrained capacity $\chi_N(\rho)$ is.
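4 Additivity of $\chi$ implies additivity of $E_F$

Recall the definition of the Holevo capacity for a channel $N$:

$$\chi_N = \max_{\{p_i, |\phi_i\rangle\}} H\Big(N\Big(\sum_i p_i |\phi_i\rangle\langle\phi_i|\Big)\Big) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big),$$

where the maximization is over ensembles $\{p_i, |\phi_i\rangle\}$ with $\sum_i p_i = 1$. Recall
also our definition of a constrained version of the Holevo capacity, which is just the definition
of the Holevo capacity with the maximization only over ensembles whose average input is $\rho$:

$$\chi_N(\rho) = \max_{\{p_i, |\phi_i\rangle\} :\ \sum_i p_i |\phi_i\rangle\langle\phi_i| = \rho} H\Big(N\Big(\sum_i p_i |\phi_i\rangle\langle\phi_i|\Big)\Big) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big).$$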

Let $\sigma$ be the state whose entanglement of formation we are trying to compute. The MSW
correspondence yields a channel $N$ and an input state $\rho$ so that

$$N(\rho) = \mathrm{Tr}_B \sigma$$

and

$$\chi_N(\rho) = H(N(\rho)) - E_F(\sigma).$$

This is very nearly the channel capacity, the only difference being that the $\rho$ above is not
necessarily the $\rho$ that maximizes $\chi_N$. Only one element is missing for the proof that
additivity of channel capacity implies additivity of entanglement of formation: namely making
sure that the average density matrix for the ensemble giving the optimum channel capacity
is equal to a desired matrix $\rho_0$. This cannot be done directly [14], but we solve the problem
indirectly.

We now give the intuition for our proof. Suppose we could define a new channel $N'$
which, instead of having capacity

$$\chi_N = \max_\rho \chi_N(\rho),$$

has capacity

$$\chi_{N'} = \max_\rho \chi_N(\rho) + \mathrm{Tr}\, \rho \tau \qquad (4)$$

for some fixed Hermitian matrix $\tau$. For a proper choice of $\tau$, this will ensure that the
maximum of this channel occurs at the desired $\rho$. Consider two entangled states $\sigma_1$ and
$\sigma_2$ which we wish to show are additive. We can find the associated channels $N_1'$ and
$N_2'$, with the capacity maximized when the average input density matrix is $\rho_1$ and
$\rho_2$, respectively. By our hypothesis of additivity of channel capacity, the tensor product
channel $N_1' \otimes N_2'$ has capacity equal to the sum of the capacities of $N_1'$ and $N_2'$.
If we can now analyze the capacity of the channel $N_1' \otimes N_2'$ carefully, we might be able
to show that the entanglement of formation $E_F(\sigma_1 \otimes \sigma_2)$ is indeed the sum of
$E_F(\sigma_1)$ and $E_F(\sigma_2)$. We do not know how to define such a channel $N'$ satisfying
(4). What we actually do is find a channel whose capacity is close to (4), or more precisely a
sequence of channels approximating (4) in the asymptotic limit. It turns out that this will be
adequate to prove the desired theorem.

We now give the definition of our new channel $N'$. It takes as its input the input to the
channel $N$, along with $k$ additional classical bits (formally, this is actually a
$2^k$-dimensional Hilbert space on which the first action of the channel is to measure it in the
canonical basis). With probability $q$ the channel $N'$ sends the first part of its input through
the channel $N$ and discards the classical bits; with probability $1 - q$ the channel $N'$ makes
a measurement on the first part of the input, and uses the results of this measurement to decide
whether or not to send the auxiliary classical bits. When the auxiliary classical bits are not
sent, an erasure symbol is sent to the receiver instead. When the auxiliary classical bits are
sent, they are labeled, so the receiver knows whether he is receiving the output of the original
channel or the auxiliary bits.

What is the capacity of this new channel $N'$? Let $E$ be the element of the POVM
measurement in the case that we send the auxiliary bits (so $I - E$ is the element of the POVM
in the case that we do not send these bits). Now, we claim that for some set of vectors
$|v_i\rangle$ and some associated set of probabilities $p_i$ the optimum signal states of this
new channel $N'$ will be $|v_i\rangle\langle v_i| \otimes |b\rangle\langle b|$ with associated
probabilities $p_i / 2^k$, where $b$ ranges over all values of the classical bits.³

We now can find bounds on the capacity of $N'$. Let $|v_i\rangle$ and $p_i$ be the optimal
signal states and probabilities for $\chi_{N'}(\rho)$. We compute

$$\chi_{N'}(\rho) = q \Big[ H\Big(N\Big(\sum_i p_i |v_i\rangle\langle v_i|\Big)\Big) - \sum_i p_i H\big(N(|v_i\rangle\langle v_i|)\big) \Big]$$
$$\quad + (1 - q)\, k \sum_i p_i \mathrm{Tr}\, E |v_i\rangle\langle v_i|$$
$$\quad + (1 - q) \Big[ H_2\Big(\sum_i p_i \mathrm{Tr}\, E |v_i\rangle\langle v_i|\Big) - \sum_i p_i H_2\big(\mathrm{Tr}\, E |v_i\rangle\langle v_i|\big) \Big], \qquad (5)$$

where $H_2$ is the binary entropy function $H_2(x) = -x \log x - (1 - x) \log(1 - x)$. The first
term is the information associated with the channel $N$, the second that associated with the
auxiliary classical bits, and the third the information associated with the measurement $E$.

³ This just says that we want to use the classical part of the channel as efficiently as
possible. The formal proof is straightforward: First, we show that it doesn't help to send
superpositions of the auxiliary bits, so we can assume that the signal states are indeed of the
form $|v_i\rangle\langle v_i| \otimes |b\rangle\langle b|$. Next, we show that if two signal
states $|v_i\rangle\langle v_i| \otimes |b_1\rangle\langle b_1|$ and
$|v_i\rangle\langle v_i| \otimes |b_2\rangle\langle b_2|$ do not have the same probabilities
associated with them, a greater capacity can be achieved by making these probabilities equal.

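Let $\rho = \sum_i p_i |v_i\rangle\langle v_i|$ and let $\sigma$ be the associated entangled
state. We can now deduce from (5) that

$$\chi_{N'}(\rho) \le q \chi_N(\rho) + (1 - q)\, k\, \mathrm{Tr}\, E \rho + (1 - q)\, \delta \qquad (6)$$

where $\delta$ is defined as

$$\delta = H_2(\mathrm{Tr}\, E \rho) - \sum_i p_i H_2\big(\langle v_i | E | v_i \rangle\big).$$

Note that $0 \le \delta \le 1$, since $\delta$ is nonnegative by the concavity of the entropy
function $H_2$, and is at most 1 since $H_2(p) \le 1$ for $0 \le p \le 1$. Similarly, if we use
the optimal states for $\chi_N(\rho)$, we find that

$$\chi_{N'}(\rho) \ge q \chi_N(\rho) + (1 - q)\, k\, \mathrm{Tr}\, E \rho. \qquad (7)$$

From Eq. (6) and Eq. (7), we find that if $\rho_0$ maximizes the quantity

$$q \chi_N(\rho) + (1 - q)\, k\, \mathrm{Tr}\, E \rho, \qquad (8)$$

then $\chi_{N'}(\rho_0)$ is guaranteed to be within $1 - q$ of the capacity of $N'$.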
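The bound $0 \le \delta \le 1$ on the binary-entropy term is easy to check numerically. The sketch below assumes the overlaps $x_i = \langle v_i | E | v_i \rangle$ are supplied directly as numbers in $[0, 1]$ (so that $\mathrm{Tr}\, E \rho = \sum_i p_i x_i$), uses logarithms to base 2, and samples random ensembles; the names `h2` and `delta` are illustrative.

```python
import math
import random

def h2(x):
    """Binary entropy H_2(x) in bits, with H_2(0) = H_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def delta(probs, overlaps):
    """delta = H_2(sum_i p_i x_i) - sum_i p_i H_2(x_i), with x_i = <v_i|E|v_i>."""
    mean = sum(p * x for p, x in zip(probs, overlaps))
    return h2(mean) - sum(p * h2(x) for p, x in zip(probs, overlaps))

def random_ensemble(rng, size):
    """Random probabilities p_i and random overlaps x_i in [0, 1]."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights], [rng.random() for _ in range(size)]

rng = random.Random(0)
samples = [delta(*random_ensemble(rng, 5)) for _ in range(1000)]

# The extreme value delta = 1 is reached when E perfectly separates two
# equally likely signal states.
delta_extreme = delta([0.5, 0.5], [0.0, 1.0])
```

Every sampled value lies in $[0, 1]$, with the lower bound coming from the concavity of $H_2$.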

We next show that we can find a measurement $E$ such that an arbitrary density matrix $\rho_0$
is a maximum of (8).

Lemma 2 For any probability $0 < q < 1$, any channel $N$, and any fixed positive matrix $\rho_0$
over the input space of $N$, there is a sufficiently large $k_0$ such that for $k \ge k_0$ we can
find an $E$ so that the maximum of (8) occurs at $\rho_0$. (This maximum need not be unique. If
$\chi_N(\rho)$ is not strictly concave at $\rho_0$, then $\rho_0$ will be just one of several
points attaining the maximum.)

Proof: It follows from the concavity of von Neumann entropy that $\chi_N(\rho)$ is concave in
$\rho$. The intuition is that we must choose $E$ so that the derivative⁴ of (8) with respect to
$\rho$ at $\rho_0$ is 0. Because we only vary over matrices with $\mathrm{Tr}\, \rho = 1$, we can
add any multiple of $I$ to $E$ and not change the derivative. Suppose that in the neighborhood of
$\rho_0$,

$$\chi_N(\rho) \le \chi_N(\rho_0) + \mathrm{Tr}\, \tau (\rho - \rho_0). \qquad (9)$$

That such an expression exists follows from the concavity of $\chi_N(\rho)$ and the assumption
that $\rho_0$ is not on the boundary of the state space, i.e., has no zero eigenvalues. A full
rank $\rho_0$ is guaranteed by the MSW correspondence.

To make $\rho_0$ a maximum for Eq. (8), we see from Eq. (9) that we need to find $E$ so that

$$\frac{1 - q}{q}\, k E = \lambda I - \tau$$

with $0 \le E \le I$. This can be done by choosing $k$ and $\lambda$ appropriately. □
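One way to pick $k$ and $\lambda$ in the last step can be sketched concretely. The choice below, taking $\lambda$ to be the largest eigenvalue of $\tau$ and $k$ at least $q(\lambda_{\max} - \lambda_{\min})/(1 - q)$, is an assumption made for illustration and is not spelled out in the text; the code is further restricted to a real symmetric 2x2 matrix $\tau$ so that the eigenvalues have a closed form.

```python
import math

def eig_sym_2x2(t):
    """Eigenvalues (smallest, largest) of a real symmetric 2x2 matrix."""
    a, b, d = t[0][0], t[0][1], t[1][1]
    mid, rad = (a + d) / 2, math.hypot((a - d) / 2, b)
    return mid - rad, mid + rad

def choose_measurement(tau, q):
    """Return (lam, k, E) with ((1 - q)/q) * k * E = lam * I - tau and 0 <= E <= I."""
    lo, hi = eig_sym_2x2(tau)
    lam = hi                                   # makes lam * I - tau >= 0
    k = max(1, math.ceil(q * (hi - lo) / (1 - q)))  # makes E <= I
    scale = q / ((1 - q) * k)
    E = [[scale * ((lam if i == j else 0.0) - tau[i][j]) for j in range(2)]
         for i in range(2)]
    return lam, k, E

# Hypothetical supporting-hyperplane matrix tau and mixing probability q.
tau = [[0.3, 0.1], [0.1, -0.2]]
q = 0.9
lam, k, E = choose_measurement(tau, q)
e_min, e_max = eig_sym_2x2(E)
```

With this choice, $q\tau + (1 - q) k E = q \lambda I$, so the linear part of the upper bound on (8) is constant on trace-one matrices and $\rho_0$ attains the maximum. Any larger $k$ works as well, at the cost of a smaller $E$.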

⁴ This is the intuition. This derivative need not actually exist.
Now, suppose we have two entangled states $\sigma_1$ and $\sigma_2$ for which we want to show
that the entanglement of formation is additive. We create the channels $N_1'$ and $N_2'$ as
detailed above. By the additivity of channel capacity (which we're assuming), the signal states
of the tensor product channel can be taken to be
$|v_i^{(1)}\rangle \otimes |b_1\rangle \otimes |v_j^{(2)}\rangle \otimes |b_2\rangle$ for $b_1$,
$b_2$ any $k$-bit strings, with probability $p_i^{(1)} p_j^{(2)} / 2^{2k}$. This gives a bound on
the channel capacity of at most

$$\chi_{N_1' \otimes N_2'} \le q \big( H(N_1(\rho_1)) - E_F(\sigma_1) \big) + (1 - q)\, k\, \mathrm{Tr}\, E_1 \rho_1$$
$$\quad + q \big( H(N_2(\rho_2)) - E_F(\sigma_2) \big) + (1 - q)\, k\, \mathrm{Tr}\, E_2 \rho_2 + 2(1 - q). \qquad (10)$$

The $2(1 - q)$ term at the end comes from the fact that the formula (8) is within $1 - q$ of the
capacity. Now, we want to show that we can find a larger capacity than this if there is a better
decomposition of $\sigma_1 \otimes \sigma_2$, i.e., if the entanglement of formation of
$\sigma_1 \otimes \sigma_2$ is not additive. The central idea here is to let $q$ go to 1; this
forces $k$ to simultaneously go to $\infty$. There is a contribution from entangled states, which
goes as $q^2$, a contribution from the auxiliary $k$-bit classical channel, which goes as
$(1 - q)k$, but which is equal in both cases, and a contribution from unentangled states, which
goes as $q(1 - q)$. As $q$ goes to 1, the contribution from the entangled states dominates the
difference.

Suppose there is a set of entangled states which give a smaller entanglement of formation
for $\sigma_1 \otimes \sigma_2$ than $E_F(\sigma_1) + E_F(\sigma_2)$. By the MSW correspondence,
this gives a set of signal states for the map $N_1 \otimes N_2$ which yield a larger constrained
capacity than $\chi_{N_1}(\rho_1) + \chi_{N_2}(\rho_2)$. We define this set of signal states for
$N_1 \otimes N_2$ to be the states $|\phi_i\rangle\langle\phi_i|$, and let the associated
probabilities be $\pi_i$. Now, using the $|\phi_i\rangle$, together with the auxiliary classical
bits, as signal states in $N_1' \otimes N_2'$ shows that

$$\chi_{N_1' \otimes N_2'} \ge q^2 H\big((N_1 \otimes N_2)(\rho_1 \otimes \rho_2)\big) - q^2 E_F(\sigma_1 \otimes \sigma_2)$$
$$\quad + (1 - q)\, k\, \mathrm{Tr}\, E_1 \rho_1 + (1 - q)\, k\, \mathrm{Tr}\, E_2 \rho_2.$$

This estimate comes from considering the information transmitted by the signal states
$|\phi_i\rangle\langle\phi_i|$ in the case (occurring with probability $q^2$) when the channels
operate as $N_1 \otimes N_2$, as well as the information transmitted by the $k$ classical bits.

We now consider the difference between this lower bound for the capacity of
$N_1' \otimes N_2'$ and the upper bound (10) we showed for the capacity using tensor product
signal states. In this difference, the terms containing $(1 - q)k$ cancel out. The remaining
terms give

$$q E_F(\sigma_1) + q E_F(\sigma_2) - q^2 E_F(\sigma_1 \otimes \sigma_2) - 2(1 - q)$$
$$\quad - q(1 - q) H(N_1(\rho_1)) - q(1 - q) H(N_2(\rho_2)).$$

For $q$ sufficiently close to 1, the $(1 - q)$ terms can be made arbitrarily small, and $q$ and
$q^2$ are both arbitrarily close to 1. This difference can thus be made positive if the
entanglement of formation is strictly subadditive, contradicting our assumption that the Holevo
channel capacity is additive.
5 The linear programming formulation

We now give the linear programming duality formulation for the constrained capacity problem.
Recall the definition of the constrained Holevo capacity

$$\chi_N(\rho) = \max_{\{p_i, |v_i\rangle\} :\ \sum_i p_i |v_i\rangle\langle v_i| = \rho} H\Big(N\Big(\sum_i p_i |v_i\rangle\langle v_i|\Big)\Big) - \sum_i p_i H\big(N(|v_i\rangle\langle v_i|)\big). \qquad (11)$$

This is a linear program, and as such it has a formulation of a dual problem that also gives 
the maximum value. This dual problem is crucial to several of our proofs. For this paper, 
we only deal with channels having finite dimensional input and output spaces. For infinite 
dimensional channels, the duality theorem fails unless the maxima are replaced by suprema. 
We have not analyzed the effects this has on the proof of our equivalence theorem, but even 
if it still holds the proofs will become more complicated. 

By the duality theorem for linear programming there is another expression for $E_F(\sigma)$,
and thus for $\chi_N(\rho)$. This was observed in [1, 16]. It is

$$\chi_N(\rho) = H(N(\rho)) - f(\rho) \qquad (12)$$

where $f$ is the linear function defined by the maximization

$$\max f(\rho) \ \text{ such that } \ f(|v\rangle\langle v|) \le H\big(N(|v\rangle\langle v|)\big) \ \text{ for all } |v\rangle \in \mathcal{H}_{in}. \qquad (13)$$

Here $\mathcal{H}_{in}$ is the input space for $N$ and the maximum is taken over all linear
functions $f(\rho) = \mathrm{Tr}\, \tau \rho$.

Eqs. (12) and (13) can be proved if $\rho$ is full rank by using the duality theorem of linear
programming. The duality theorem applies directly if there are only a finite number of
possible signal states allowed, showing the equality of the modified version of Eqs. (11) and
(12) where the constraints in (13) are limited to a finite number of possible signal states
$|v_i\rangle$, which are also the only signal states allowed in the capacity calculation (11). To
extend from all finite collections of signal states to all $|v\rangle\langle v|$, we need to show
that we can find a compact set of linear functions $f(\rho) = \mathrm{Tr}\, \tau \rho$ which
suffice to satisfy Eq. (13). We can then use compactness to show that a limit of these functions
exists, where in the limit Eqs. (11) and (13) must hold on a countable set of possible signal
states $|v_i\rangle$ dense in the set of unit vectors, thus showing that they hold on the set of
all unit vectors $|v\rangle$. The compactness follows from $\rho$ being full rank, and
$H(N(|v\rangle\langle v|)) \le \log d_{out}$ for all $|v\rangle\langle v|$, where $d_{out}$ is
the dimension of the output space of $N$. The case where $\rho$ is not full rank can be proved by
using the observation that the only values of the function $f$ which are relevant in this case
are those in the support of $\rho$.

Equality must hold in (13) for those $|v\rangle$ which are signal states in an optimal
decomposition. This can be seen by considering the inequalities

$$\chi_N(\rho) = H(N(\rho)) - \sum_i p_i H\big(N(|v_i\rangle\langle v_i|)\big)$$
$$\le H(N(\rho)) - \sum_i p_i f(|v_i\rangle\langle v_i|)$$
$$= H(N(\rho)) - f(\rho).$$

For equality to hold, it must hold in all the terms in the summation, which are exactly the
signal states $|v_i\rangle$.

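6 Additivity of $E_F$ implies strong superadditivity of $E_F$

In this section, we will show that additivity of entanglement of formation implies strong
superadditivity of entanglement of formation. Another proof was discovered independently
by Pomeransky [13]; it is quite similar, although it is expressed using different terminology.

We first give the statement of strong superadditivity. Assume we have a quadripartite
density matrix $\sigma$ whose four parts are A1, A2, B1 and B2. The statement of strong
superadditivity is that

$$E_F(\sigma) \ge E_F(\mathrm{Tr}_2 \sigma) + E_F(\mathrm{Tr}_1 \sigma) \qquad (14)$$

where $E_F$ is the entanglement of formation when the state is considered as a bipartite state
where the two parts are A and B; that is,

$$E_F(\sigma) = \min_{\sum_i p_i |\phi_i\rangle\langle\phi_i| = \sigma} \sum_i p_i H(\mathrm{Tr}_B |\phi_i\rangle\langle\phi_i|). \qquad (15)$$

First, we show that it is sufficient to prove this when $\sigma$ is a pure state. Consider the
optimal decomposition $\sigma = \sum_i \pi_i |\phi_i\rangle\langle\phi_i|$. We can apply the
theorem of strong superadditivity to the pure states $|\phi_i\rangle\langle\phi_i|$ to obtain
decompositions
$\mathrm{Tr}_1 |\phi_i\rangle\langle\phi_i| = \sum_j p_{ij}^{(2)} |v_{ij}^{(2)}\rangle\langle v_{ij}^{(2)}|$
and
$\mathrm{Tr}_2 |\phi_i\rangle\langle\phi_i| = \sum_j p_{ij}^{(1)} |v_{ij}^{(1)}\rangle\langle v_{ij}^{(1)}|$
so that

$$H(\mathrm{Tr}_B |\phi_i\rangle\langle\phi_i|) \ge \sum_j p_{ij}^{(1)} H\big(\mathrm{Tr}_B |v_{ij}^{(1)}\rangle\langle v_{ij}^{(1)}|\big) + \sum_j p_{ij}^{(2)} H\big(\mathrm{Tr}_B |v_{ij}^{(2)}\rangle\langle v_{ij}^{(2)}|\big).$$

Summing these inequalities over $i$, weighted by the probabilities $\pi_i$, gives the desired
inequality.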

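We now show that additivity of $E_F$ implies strong superadditivity of $E_F$. Let
$|\phi\rangle$ be a quadripartite pure state for which we wish to show strong superadditivity. We
define $\sigma_1 = \mathrm{Tr}_2 |\phi\rangle\langle\phi|$ and
$\sigma_2 = \mathrm{Tr}_1 |\phi\rangle\langle\phi|$. Now, let us use the MSW correspondence to
find channels $N_1$ and $N_2$ and density matrices $\rho_1$ and $\rho_2$ such that

$$N_1(\rho_1) = \mathrm{Tr}_B \sigma_1 \quad \text{and} \quad N_2(\rho_2) = \mathrm{Tr}_B \sigma_2$$

and

$$\chi_{N_1}(\rho_1) = H(N_1(\rho_1)) - E_F(\sigma_1),$$
$$\chi_{N_2}(\rho_2) = H(N_2(\rho_2)) - E_F(\sigma_2).$$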

We first do an easy case which illustrates how the proof works without introducing additional
complexities. Let $d_1$ and $d_2$ be the dimensions of the input spaces of $N_1$ and $N_2$. In
the easy case, we assume that there are $d_1^2$ linearly independent signal states in an optimal
decomposition of $\rho_1$ for $\chi_{N_1}(\rho_1)$, and $d_2^2$ linearly independent signal
states in an optimal decomposition of $\rho_2$ for $\chi_{N_2}(\rho_2)$. Let these sets of signal
states be $|v_i^{(1)}\rangle\langle v_i^{(1)}|$ with probabilities $p_i^{(1)}$, and
$|v_j^{(2)}\rangle\langle v_j^{(2)}|$ with probabilities $p_j^{(2)}$, respectively. It now follows
from our assumption of the additivity of entanglement of formation that an optimal ensemble of
signal states for $\chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2)$ is
$|v_i^{(1)}\rangle \otimes |v_j^{(2)}\rangle$ with probability $p_i^{(1)} p_j^{(2)}$.

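Now, let us consider the dual linear function $f_T$ for the tensor product channel $N_1 \otimes N_2$.
Since we assumed that entanglement of formation is additive, by the MSW correspondence
$\chi_N(\rho)$ is also additive. We claim that the dual function $f_T$ must satisfy

$$f_T\big(|v_i^{(1)}\rangle\langle v_i^{(1)}| \otimes |v_j^{(2)}\rangle\langle v_j^{(2)}|\big) = H\big(N_1(|v_i^{(1)}\rangle\langle v_i^{(1)}|)\big) + H\big(N_2(|v_j^{(2)}\rangle\langle v_j^{(2)}|)\big) \tag{16}$$

for all signal states $|v_i^{(1)}\rangle$, $|v_j^{(2)}\rangle$. This is simply because equality must hold in the inequality
(13) for all signal states. However, we now have that $f_T$ is a linear function in a $d_1^2 d_2^2 - 1$
dimensional space which has been specified on $d_1^2 d_2^2$ linearly independent points; this implies
that the linear function $f_T$ is uniquely defined. It is easy to see that it thus must be the case
that

$$f_T(\rho) = f_1(\mathrm{Tr}_2\, \rho) + f_2(\mathrm{Tr}_1\, \rho), \tag{17}$$

as this holds for the $d_1^2 d_2^2$ signal states $|v_i^{(1)}\rangle \otimes |v_j^{(2)}\rangle$. We now let $|\psi\rangle\langle\psi|$ be the preimage of
$\mathrm{Tr}_B |\phi\rangle\langle\phi|$ under the channel $N_1 \otimes N_2$. We have, from the equations (13) and (17), that

$$f_1(\mathrm{Tr}_2 |\psi\rangle\langle\psi|) + f_2(\mathrm{Tr}_1 |\psi\rangle\langle\psi|) \le H\big(N_1 \otimes N_2(|\psi\rangle\langle\psi|)\big). \tag{18}$$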

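But recall that

$$f_1(\mathrm{Tr}_2 |\psi\rangle\langle\psi|) = E_F(\sigma_1),$$

$$f_2(\mathrm{Tr}_1 |\psi\rangle\langle\psi|) = E_F(\sigma_2), \tag{19}$$

because (13) holds with equality for signal states, and that, by the choice of $|\psi\rangle$,

$$N_1 \otimes N_2(|\psi\rangle\langle\psi|) = \mathrm{Tr}_B |\phi\rangle\langle\phi|.$$

Thus, substituting into (18), we find that

$$E_F(\sigma_1) + E_F(\sigma_2) \le H(\mathrm{Tr}_B |\phi\rangle\langle\phi|),$$

which is the statement for the strong superadditivity of entanglement of formation of the
pure state $|\phi\rangle\langle\phi|$.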
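For reference, the substitutions of the easy case can be collected into a single chain. This is only a summary of the steps above, written with the same symbols; it adds no new assumption beyond those already made in the easy case.

```latex
\begin{align*}
E_F(\sigma_1) + E_F(\sigma_2)
  &= f_1(\mathrm{Tr}_2 |\psi\rangle\langle\psi|) + f_2(\mathrm{Tr}_1 |\psi\rangle\langle\psi|)
     && \text{by (19)} \\
  &= f_T(|\psi\rangle\langle\psi|)
     && \text{by (17)} \\
  &\le H\big(N_1 \otimes N_2(|\psi\rangle\langle\psi|)\big)
     && \text{by (13), applied to } f_T \\
  &= H(\mathrm{Tr}_B |\phi\rangle\langle\phi|)
     && \text{by the choice of } |\psi\rangle .
\end{align*}
```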

We now consider the case where there are fewer than $d_i^2$ signal states for $\chi_{N_i}(\rho_i)$, $i = 1, 2$.
We still know that the average density matrices of the signal states for $N_1$ and $N_2$ are
$\rho_1$ and $\rho_2$, and that the supports of these two matrices are the entire input spaces $\mathcal{H}_{1,\mathrm{in}}$ and
$\mathcal{H}_{2,\mathrm{in}}$. The argument will go as before if we can again show that the dual function $f_T$ must
be $f_1(\mathrm{Tr}_2\, \rho) + f_2(\mathrm{Tr}_1\, \rho)$. In this case we do not know $d_1^2 d_2^2$ points of the function $f_T$, and
thus cannot use the same argument as above to show that $f_T$ is determined. However, there
is more information that we have available. Namely, we know that in the neighborhood
of the signal states $|v_i^{(1)}\rangle$, the entropy $H(N_1(|v\rangle\langle v|))$ must be at least the dual function
$f_1 = \mathrm{Tr}\, \tau_1 |v\rangle\langle v|$, and that these two functions are equal at the signal states. If we assume
that the derivative of $H(N_1(|v\rangle\langle v|))$ exists at $|v_i^{(1)}\rangle\langle v_i^{(1)}|$, then we can conclude that this is
also the derivative of $f_1 = \mathrm{Tr}\, \tau_1 |v\rangle\langle v|$. For the time being we will assume that the first
derivative of this entropy function does in fact exist.$^5$

We need a lemma. 

$^5$ In fact, I believe the function is smooth enough that these derivatives do exist. However, we find it easier to
deal with the cases where $N_1(|v\rangle\langle v|)$ has zero eigenvalues by expressing $N_1$ and $N_2$ as a limit of nonsingular
completely positive maps.

Lemma 3 Suppose that we have a set of unit vectors $|v_i\rangle$ that span a Hilbert space $\mathcal{H}$. If
we are given the value of $f$ at all the vectors $|v_i\rangle$ as well as the value of the first derivative
of $f$,

$$\lim_{\epsilon \to 0} \frac{1}{\epsilon} \Big[ f\Big( \big(\sqrt{1-\epsilon^2}\, |v_i\rangle + \epsilon |w\rangle\big)\big(\sqrt{1-\epsilon^2}\, \langle v_i| + \epsilon \langle w|\big) \Big) - f\big(|v_i\rangle\langle v_i|\big) \Big]$$

at all the vectors $|v_i\rangle$ and for all orthogonal $|w\rangle$, then $f$ is completely determined.



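Proof: Let us use the representation $f(\rho) = \mathrm{Tr}\, \tau\rho$ (we do not need a constant term on
the right hand side because we need only specify $f$ on trace 1 matrices). Suppose that
$\langle v_i | w \rangle = 0$. We compute the derivative at $|v_i\rangle$ in the $|w\rangle$ direction:

$$\big(\sqrt{1-\epsilon^2}\, \langle v_i| + \epsilon \langle w|\big)\, \tau\, \big(\sqrt{1-\epsilon^2}\, |v_i\rangle + \epsilon |w\rangle\big) - \langle v_i|\tau|v_i\rangle \approx \epsilon \big( \langle v_i|\tau|w\rangle + \langle w|\tau|v_i\rangle \big). \tag{20}$$

The derivative in the $i|w\rangle$ direction gives

$$i \big( \langle v_i|\tau|w\rangle - \langle w|\tau|v_i\rangle \big), \tag{21}$$

so a linear combination of (20) and (21) shows that the value of $\langle v_i|\tau|w\rangle$ is determined for
all $|w\rangle$ orthogonal to $|v_i\rangle$. Since we also know the value of $\langle v_i|\tau|v_i\rangle$, it follows that the
value of $\langle v_i|\tau|w\rangle$ is determined for all $|w\rangle$. Since the $\langle v_i|$ span the vector space, this
determines the value of $\langle u|\tau|w\rangle$ for all $\langle u|$ and all $|w\rangle$, thus determining the matrix $\tau$. □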
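The linear combination used in the proof can be checked numerically. The sketch below is an illustration only: the matrix `tau`, the vectors `v` and `w`, and all function names are hypothetical choices, and the derivatives are approximated by finite differences. It combines the derivative in the $|w\rangle$ direction, which equals $2\,\mathrm{Re}\,\langle v|\tau|w\rangle$, with the one in the $i|w\rangle$ direction, which equals $-2\,\mathrm{Im}\,\langle v|\tau|w\rangle$, to recover the off-diagonal element.

```python
import math

def expect(tau, psi):
    """Return <psi|tau|psi> for a matrix tau (nested lists) and a vector psi."""
    n = len(psi)
    return sum(psi[a].conjugate() * tau[a][b] * psi[b]
               for a in range(n) for b in range(n))

def directional_derivative(tau, v, w, eps=1e-6):
    """Finite-difference derivative of f(psi) = <psi|tau|psi> at |v> towards |w>."""
    c = math.sqrt(1.0 - eps * eps)
    shifted = [c * v[a] + eps * w[a] for a in range(len(v))]
    return ((expect(tau, shifted) - expect(tau, v)) / eps).real

def recover_element(tau, v, w):
    """Combine the |w> and i|w> derivatives to recover <v|tau|w>."""
    d_w = directional_derivative(tau, v, w)                      # ~ 2 Re <v|tau|w>
    d_iw = directional_derivative(tau, v, [1j * x for x in w])   # ~ -2 Im <v|tau|w>
    return 0.5 * (d_w - 1j * d_iw)

# Hypothetical 2x2 Hermitian tau, with |v> = |0> and the orthogonal |w> = |1>.
tau = [[0.3, 0.2 - 0.5j], [0.2 + 0.5j, 1.1]]
v = [1.0 + 0j, 0j]
w = [0j, 1.0 + 0j]
recovered = recover_element(tau, v, w)   # approximates tau[0][1]
```

Only values of $f$ and its first derivatives at $|v\rangle$ enter `recover_element`, which is the information the lemma assumes is given.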
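We now need to compute the derivative of the entropy of $N_1$. Let

$$N_1(\rho) = \sum_i A_i \rho A_i^\dagger$$

with $\sum_i A_i^\dagger A_i = I$. Then if $\mathrm{Tr}\, \sigma = 0$,

$$H(N_1(\rho + \epsilon\sigma)) - H(N_1(\rho)) \approx -\epsilon\, \mathrm{Tr}\big[(I + \log N_1(\rho))\, N_1(\sigma)\big]$$

$$= -\epsilon\, \mathrm{Tr}\Big[\sigma \sum_i A_i^\dagger (\log N_1(\rho)) A_i\Big]. \tag{22}$$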
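The passage between the two forms of (22) can be spelled out as follows. This is a sketch under two assumptions that the text makes only implicitly: $N_1(\rho)$ has full rank, so that its logarithm is defined, and the logarithm is natural (for another base the identity term picks up a constant factor, which is harmless because that term vanishes). The symbols $\omega$ and $\eta$ are shorthand introduced here.

```latex
\begin{align*}
\frac{d}{d\epsilon} H(\omega + \epsilon\,\eta)\Big|_{\epsilon = 0}
  &= -\mathrm{Tr}\big[\eta\,(I + \log \omega)\big],
  \qquad \omega = N_1(\rho),\quad \eta = N_1(\sigma), \\
-\mathrm{Tr}\big[(I + \log N_1(\rho))\,N_1(\sigma)\big]
  &= -\mathrm{Tr}\,N_1(\sigma)
     - \mathrm{Tr}\Big[\log N_1(\rho) \sum_i A_i\,\sigma\,A_i^\dagger\Big] \\
  &= -\mathrm{Tr}\Big[\sigma \sum_i A_i^\dagger\,(\log N_1(\rho))\,A_i\Big].
\end{align*}
```

The last step uses $\mathrm{Tr}\,N_1(\sigma) = \mathrm{Tr}\,\sigma = 0$, which holds because $N_1$ is trace preserving, together with cyclicity of the trace.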

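Now, if the entanglement of formation is additive, then the derivative of $H(N_1 \otimes N_2)$ at the
tensor product signal states $|v_i^{(1)}\rangle\langle v_i^{(1)}| \otimes |v_j^{(2)}\rangle\langle v_j^{(2)}|$ must also match the derivative of the
function $f_T$ at these points. We calculate:

$$H(N_1 \otimes N_2(\rho + \epsilon\sigma)) - H(N_1 \otimes N_2(\rho)) \approx -\epsilon\, \mathrm{Tr}\Big[\sigma \sum_{k_1, k_2} \big(A_{k_1}^{(1)\dagger} \otimes A_{k_2}^{(2)\dagger}\big) \big(\log(N_1 \otimes N_2(\rho))\big) \big(A_{k_1}^{(1)} \otimes A_{k_2}^{(2)}\big)\Big].$$

Now at a point $\rho = \rho_1 \otimes \rho_2$,

$$\sum_{k_1, k_2} \big(A_{k_1}^{(1)\dagger} \otimes A_{k_2}^{(2)\dagger}\big) \big(\log N_1 \otimes N_2(\rho)\big) \big(A_{k_1}^{(1)} \otimes A_{k_2}^{(2)}\big)$$

$$= \Big(\sum_{k_1} A_{k_1}^{(1)\dagger} (\log N_1(\rho_1)) A_{k_1}^{(1)}\Big) \otimes I + I \otimes \Big(\sum_{k_2} A_{k_2}^{(2)\dagger} (\log N_2(\rho_2)) A_{k_2}^{(2)}\Big),$$

showing that at the states $|v_i^{(1)}\rangle \otimes |v_j^{(2)}\rangle$, we have not only that $f_T = f_1 + f_2$, but that
the first derivatives (for directions $\sigma$ with $\mathrm{Tr}\, \sigma = 0$) are equal as well. Since the states
$|v_i^{(1)}\rangle \otimes |v_j^{(2)}\rangle$ span the vector space, Lemma 3 shows that $f_T = f_1 + f_2$ everywhere, giving
us the last element of the proof.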

The one thing remaining to do is to show that the assumption that the first derivative of
entropy exists everywhere is unnecessary. It suffices to show that there are dual functions
$f_T = f_1 + f_2$ such that Eq. (17) holds. We do this by taking limits. For $x = 1, 2$ let $N_x^{(q)}$ be
the quantum channel

$$N_x^{(q)}(\rho) = q N_x(\rho) + (1-q) \frac{1}{d_{\mathrm{out},x}} I$$

which averages the map $N_x$ with the maximally mixed state $I/d_{\mathrm{out},x}$. Let $N_T^{(q)} = N_1^{(q)} \otimes N_2^{(q)}$.
We need to show that some limits of the dual functions $f_1^{(q)}$, $f_2^{(q)}$, and $f_T^{(q)}$ exist. By
continuity of $N_x^{(q)}$, they will be forced to have the desired properties (17), (18), and (19). Let
$\rho_T = \rho_1 \otimes \rho_2$. Now, $f_T^{(q)}$ is a linear function with $f_T^{(q)}(\rho_T) \ge 0$ and $f_T^{(q)}(\rho) \le \log d_{\mathrm{out},T}$
for all $\rho$, so the $f_T^{(q)}$ lie in a compact set. Thus, some subsequence of $f_T^{(q)}$ has a limit as
$q \to 1$. The same argument applies to $f_1^{(q)}$ and $f_2^{(q)}$, so by taking these limits we find that
the functions $f_i$ have the desired properties, completing our proof.

7 Additivity of min $H(N)$ implies additivity of $E_F$.

Suppose that we have two bipartite states for which we wish to prove that the entanglement 
of formation is additive. We use the MSW correspondence to convert this problem to a 
question about the Holevo capacity with a constrained average signal state. We thus now 
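have two quantum channels $N_1$ and $N_2$, and two states $\rho_1$ and $\rho_2$. We want to show that

$$\chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2) = \chi_{N_1}(\rho_1) + \chi_{N_2}(\rho_2).$$

In fact, we need only prove the $\le$ direction of the inequality, as the $\ge$ direction is easy.
Let $|v_i^{(1)}\rangle$ and $|v_i^{(2)}\rangle$ be optimal sets of signal states for $\chi_{N_1}(\rho_1)$ and $\chi_{N_2}(\rho_2)$, so that

$$\chi_{N_1}(\rho_1) = H(N_1(\rho_1)) - \sum_i p_i^{(1)} H\big(N_1(|v_i^{(1)}\rangle\langle v_i^{(1)}|)\big)$$

where $\rho_1 = \sum_i p_i^{(1)} |v_i^{(1)}\rangle\langle v_i^{(1)}|$, and similarly for $N_2$. By the linear programming dual
formulation in Section 5 we have that there is a matrix $\tau_1$ such that

$$\chi_{N_1}(\rho_1) = H(N_1(\rho_1)) - \mathrm{Tr}\, \tau_1 \rho_1$$

and

$$\mathrm{Tr}\, \tau_1 \rho \le H(N_1(\rho))$$

for all $\rho$, with equality for signal states $\rho = |v_i^{(1)}\rangle\langle v_i^{(1)}|$, and similarly for $\tau_2$ and $N_2$. Suppose
we could find channels $N_1'$ and $N_2'$ such that

$$H(N_1'(|v\rangle\langle v|)) = H(N_1(|v\rangle\langle v|)) + C_1 - \langle v|\tau_1|v\rangle \tag{23}$$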

for all vectors $|v\rangle$ (similarly for $N_2$). We know from the linear programming duality theorem
that

$$H(N_1'(\rho)) = H(N_1(\rho)) + C_1 - \mathrm{Tr}\, \tau_1 \rho \ge C_1$$

for all input states $\rho$, with equality holding for the signal states $\rho = |v_i^{(1)}\rangle\langle v_i^{(1)}|$. Thus, the
minimum entropy output of $N_1'$ is $C_1$ and of $N_2'$ is $C_2$. Also,

$$\chi_{N_1'}(\rho_1) = H(N_1'(\rho_1)) - \sum_i p_i^{(1)} H\big(N_1'(|v_i^{(1)}\rangle\langle v_i^{(1)}|)\big)$$

and similarly for $N_2'$. Now, if we assume the additivity of minimum entropy, we know that
the minimum entropy output of $N_1' \otimes N_2'$ has entropy $C_1 + C_2$. We have, for some probability
distribution $\pi_i$ on signal states $|\phi_i\rangle$, that

$$\chi_{N_1' \otimes N_2'}(\rho_1 \otimes \rho_2) = H(N_1' \otimes N_2'(\rho_1 \otimes \rho_2)) - \sum_i \pi_i H\big(N_1' \otimes N_2'(|\phi_i\rangle\langle\phi_i|)\big)$$

$$\le H(N_1'(\rho_1)) + H(N_2'(\rho_2)) - C_1 - C_2$$

$$= \chi_{N_1'}(\rho_1) + \chi_{N_2'}(\rho_2).$$

Now, if we can examine the construction of the channels $N_1'$ and $N_2'$ and show that the
additivity of the constrained Holevo capacity for $N_1'$ and $N_2'$ implies the additivity of the
constrained Holevo capacity for $N_1$ and $N_2$, we will be done.

We will not be able to achieve Eq. (23) exactly, but will be able to achieve this
approximately, in much the same way we defined $N'$ in an earlier section.

Given a channel $N$, we define a new channel $N'$. On input $\rho$, with probability $q$ the
channel $N'$ outputs $N(\rho)$. With probability $1 - q$ the channel makes a POVM measurement
with elements $E$ and $I - E$. If the measurement outcome is $E$, $N'$ outputs the tensor product
of a pure state signifying that the result was $E$ and the maximally mixed state on $k$ qubits. If
the result is $I - E$ the channel $N'$ outputs only a pure state signifying this fact. We have

$$H(N'(\rho)) = qH(N(\rho)) + H_2(q) + (1-q)\, k\, \mathrm{Tr}\, E\rho + (1-q) H_2(\mathrm{Tr}\, E\rho).$$
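The entropy formula for $N'$ can be sanity-checked by writing out the spectrum of the output directly. The sketch below assumes that the three branches of $N'$ (the output of $N$, the outcome $E$, and the outcome $I - E$) are supported on mutually orthogonal subspaces, which is what the $H_2(q)$ term in the formula suggests; the function names and the numerical values are hypothetical.

```python
import math

def shannon(probs):
    """Entropy in bits of a list of nonnegative weights summing to 1."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def h2(x):
    """Binary entropy in bits."""
    return shannon([x, 1.0 - x])

def spectrum_of_n_prime(mu, q, k, t):
    """Eigenvalues of N'(rho), assuming the three branches are orthogonal.

    mu: eigenvalues of N(rho); t: Tr(E rho); k: number of qubits.
    """
    branch_n = [q * m for m in mu]                      # channel N was applied
    branch_e = [(1 - q) * t / 2**k] * (2**k)            # outcome E, k mixed qubits
    branch_not_e = [(1 - q) * (1 - t)]                  # outcome I - E, flag only
    return branch_n + branch_e + branch_not_e

def closed_form_entropy(mu, q, k, t):
    """Right-hand side of the formula for H(N'(rho))."""
    return q * shannon(mu) + h2(q) + (1 - q) * k * t + (1 - q) * h2(t)

# Hypothetical values: N(rho) has eigenvalues (0.7, 0.3), Tr(E rho) = 0.25.
mu, q, k, t = [0.7, 0.3], 0.9, 3, 0.25
direct = shannon(spectrum_of_n_prime(mu, q, k, t))
closed_form = closed_form_entropy(mu, q, k, t)
```

Under the orthogonality assumption the two numbers agree up to rounding, since the entropy of a block-diagonal state is the entropy of the block weights plus the weighted block entropies.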

If we choose $k$ and $E$ such that

$$\frac{1-q}{q}\, kE = \lambda I - \tau,$$

we will have

$$H(N'(|v\rangle\langle v|)) = qH(N(|v\rangle\langle v|)) - q\langle v|\tau|v\rangle + q\lambda + H_2(q) + (1-q) H_2(\langle v|E|v\rangle).$$
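One way to see that suitable $k$ and $E$ exist is to construct them explicitly. The sketch below reads the condition as $\frac{1-q}{q} kE = \lambda I - \tau$, which is what the entropy formula that follows it requires, and for simplicity assumes that $\tau$ is diagonal. The function `choose_povm` and its selection rule (smallest admissible $\lambda$, then smallest admissible $k$) are illustrative choices, not necessarily the ones the author has in mind.

```python
import math

def choose_povm(tau_diag, q, lam=None):
    """Pick lam, k and a diagonal E with ((1-q)/q) * k * E = lam*I - tau.

    tau_diag: diagonal entries of tau (assumed diagonal in this sketch).
    """
    if lam is None:
        lam = max(tau_diag)                  # smallest lam that keeps E >= 0
    spread = lam - min(tau_diag)
    # Smallest integer k (at least 1) that keeps every entry of E at most 1.
    k = max(1, math.ceil(q * spread / (1 - q)))
    e_diag = [q * (lam - t) / ((1 - q) * k) for t in tau_diag]
    return lam, k, e_diag

# Hypothetical diagonal tau and mixing probability q.
tau_diag = [0.2, 0.9, 0.5]
q = 0.95
lam, k, e_diag = choose_povm(tau_diag, q)
```

As $q$ approaches 1 the required $k$ grows like $1/(1-q)$, so the auxiliary output of maximally mixed qubits becomes large in this regime.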

The minimum entropy $H(N'(|v\rangle\langle v|))$ is thus at least $q\lambda + H_2(q)$. For signal states $|v_i\rangle$ of
$N$, $H(N'(|v_i\rangle\langle v_i|))$ is at least $q\lambda + H_2(q)$ and at most $q\lambda + H_2(q) + 1 - q$. As $q$ goes to 1,
this is approximately a constant. We thus see that

$$H(N_1'(\rho_1)) - q\lambda_1 - H_2(q) - (1-q) \le \chi_{N_1'}(\rho_1) \le H(N_1'(\rho_1)) - q\lambda_1 - H_2(q). \tag{24}$$

Now, given two channels $N_1$ and $N_2$, we can prepare $N_1'$ and $N_2'$ as above. If we assume
the additivity of minimum entropy, this implies the constrained channel capacity satisfies,
for the optimal input ensembles $|\phi_i\rangle$, $\pi_i$,

$$\chi_{N_1' \otimes N_2'}(\rho_1 \otimes \rho_2) = H(N_1'(\rho_1)) + H(N_2'(\rho_2)) - \sum_i \pi_i H\big(N_1' \otimes N_2'(|\phi_i\rangle\langle\phi_i|)\big)$$

$$\le H(N_1'(\rho_1)) + H(N_2'(\rho_2)) - q\lambda_1 - q\lambda_2 - 2H_2(q)$$

$$\le \chi_{N_1'}(\rho_1) + \chi_{N_2'}(\rho_2) + 2(1-q)$$

where the first inequality follows from the assumption of additivity of the minimum entropy
output, and the second from Eq. (24).

We now need to relate $\chi_{N_1'}(\rho_1)$ and $\chi_{N_1}(\rho_1)$. Suppose we have an ensemble of signal
states $|v_i\rangle$ with associated probabilities $p_i$, and such that $\sum_i p_i |v_i\rangle\langle v_i| = \rho$. Define $C_{N_1}$
($C_{N_1'}$) to be the information transmitted by channel $N_1$ ($N_1'$) using these signal states. We
then have

$$C_{N_1'} = q\, C_{N_1} + (1-q)\, \delta_1$$

where

$$\delta_1 = H_2(\mathrm{Tr}\, E\rho) - \sum_i p_i H_2(\langle v_i|E|v_i\rangle).$$

This shows that

$$q\, \chi_{N_1}(\rho_1) \le \chi_{N_1'}(\rho_1) \le q\, \chi_{N_1}(\rho_1) + (1-q).$$
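The two-sided bound rests on the correction term lying between 0 and 1: it is nonnegative by concavity of the binary entropy and at most 1 because a binary entropy never exceeds one bit. A minimal numerical illustration follows; the ensemble probabilities, the values of $\langle v_i|E|v_i\rangle$, and the value used for the information transmitted by the original channel are all hypothetical.

```python
import math

def h2(x):
    """Binary entropy in bits, with the convention 0 * log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def delta(probs, e_vals):
    """H_2(Tr E rho) - sum_i p_i H_2(<v_i|E|v_i>) for an ensemble.

    probs: ensemble probabilities p_i; e_vals: the numbers <v_i|E|v_i>.
    """
    t = sum(p * e for p, e in zip(probs, e_vals))   # Tr(E rho), by linearity
    return h2(t) - sum(p * h2(e) for p, e in zip(probs, e_vals))

def transmitted_by_n_prime(c_n, q, d):
    """Information sent through N' given that sent through N: q*C_N + (1-q)*delta."""
    return q * c_n + (1 - q) * d

# Hypothetical ensemble and parameters.
probs = [0.5, 0.3, 0.2]
e_vals = [0.1, 0.6, 0.9]
c_n, q = 0.8, 0.9
d = delta(probs, e_vals)
c_prime = transmitted_by_n_prime(c_n, q, d)
```

Maximizing both sides over ensembles with average state $\rho_1$ then gives the stated bounds on the constrained capacity of $N_1'$.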

Also, by using the optimal set of signal states for $\chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2)$ as signal states for the
channel $N_1' \otimes N_2'$, we find that

$$\chi_{N_1' \otimes N_2'}(\rho_1 \otimes \rho_2) \ge q^2 \chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2)$$

since with probability $q^2$, the channel $N_1' \otimes N_2'$ simulates $N_1 \otimes N_2$. Thus, we have that

$$\chi_{N_1 \otimes N_2}(\rho_1 \otimes \rho_2) \le q^{-2} \chi_{N_1' \otimes N_2'}(\rho_1 \otimes \rho_2)$$

$$\le q^{-2} \big(\chi_{N_1'}(\rho_1) + \chi_{N_2'}(\rho_2)\big) + 2(1-q) q^{-2}$$

$$\le q^{-1} \big(\chi_{N_1}(\rho_1) + \chi_{N_2}(\rho_2)\big) + 4(1-q) q^{-2}$$

holds for all $q$, $0 < q < 1$. Letting $q$ go to 1, we have subadditivity of the constrained Holevo
capacity, implying additivity of the entanglement of formation.
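To make the limiting step concrete, the final upper bound can be evaluated for values of $q$ approaching 1. The capacities used below are placeholder numbers, not values computed for any particular channel.

```python
def upper_bound(chi1, chi2, q):
    """Evaluate (chi1 + chi2)/q + 4*(1 - q)/q**2, the last line of the chain."""
    return (chi1 + chi2) / q + 4 * (1 - q) / q**2

# Placeholder constrained capacities for the two channels.
chi1, chi2 = 0.6, 0.9
qs = (0.5, 0.9, 0.99, 0.999)
bounds = [upper_bound(chi1, chi2, q) for q in qs]
```

The bound decreases towards `chi1 + chi2` as `q` tends to 1, which is the subadditivity statement.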
8 Implications of strong superadditivity of $E_F$.

All three additivity properties (i) to (iii) follow easily from the assumption of strong
superadditivity of $E_F$. The additivity of $E_F$ follows trivially from this assumption. That the
additivity of $\chi_N$ follows is known [12]. We repeat this argument below for completeness.
Recall the definition of $\chi_N$:

$$\chi_N = \max_{\{p_i, |\phi_i\rangle\}} H\Big(N\Big(\sum_i p_i |\phi_i\rangle\langle\phi_i|\Big)\Big) - \sum_i p_i H\big(N(|\phi_i\rangle\langle\phi_i|)\big).$$

Suppose that the maximum is attained at an ensemble $p_i$, $|\phi_i\rangle$ that is not a tensor product
distribution. If we replace this ensemble with the product of the marginal ensembles, the
concavity of von Neumann entropy implies that the first term increases, and the
superadditivity of entanglement of formation implies that the second term decreases, showing that we
can do at least as well by using a tensor product distribution, and that $\chi_N$ is thus additive.

Finally, the proof that strong superadditivity of $E_F$ implies additivity of minimum output
entropy is equally easy, although I am not aware of its being in the literature. Suppose that we
have a minimum entropy output $N_1 \otimes N_2(|\phi\rangle\langle\phi|)$. The strong superadditivity of $E_F$ implies
that there are ensembles $p_i^{(1)}$, $|v_i^{(1)}\rangle$ and $p_i^{(2)}$, $|v_i^{(2)}\rangle$ such that

$$H\big(N_1 \otimes N_2(|\phi\rangle\langle\phi|)\big) \ge \sum_i p_i^{(1)} H\big(N_1(|v_i^{(1)}\rangle\langle v_i^{(1)}|)\big) + \sum_i p_i^{(2)} H\big(N_2(|v_i^{(2)}\rangle\langle v_i^{(2)}|)\big).$$

But the two sums on the right hand side are averages, so there must be one quantum state in
each of these sums that has output entropy no larger than the average output entropy; this shows
additivity of the minimum entropy output.

9 Additivity of $\chi_N$ or of $E_F$ implies additivity of min $H(N)$.

Suppose we have two channels $N_1$ and $N_2$ which map their input onto $d$-dimensional output
spaces. We can assume that the two output dimensions are the same by embedding the
smaller dimensional output space into a larger dimensional one.$^6$ We will define two new
channels $N_1'$ and $N_2'$. The channel $N_1'$ will take as input the tensor product of the input
space of channel $N_1$ and an integer between 0 and $d^2 - 1$. Now, let $X_0, \ldots, X_{d^2-1}$ be the
$d$-dimensional generalization of the Pauli matrices: $X_{da+b} = T^a R^b$, where $T$ takes $|j\rangle$ to
$|j + 1 \pmod d\rangle$ and $R$ takes $|j\rangle$ to $e^{2\pi i j/d} |j\rangle$. Let

$$N_1'(\rho \otimes |i\rangle\langle i|) = X_i N_1(\rho) X_i^\dagger. \tag{25}$$

Now, suppose that $|v_1\rangle$ is the input giving the minimal entropy output $N_1(|v_1\rangle\langle v_1|)$.
We claim that a good ensemble of signal states for the channel $N_1'$ is $|v_1\rangle \otimes |i\rangle$, where
$i = 0, 1, \ldots, d^2 - 1$, with equal probabilities. This is because for this set of signal states,
the first term in the formula for the Holevo capacity is maximized (taking any state $\rho$ and
averaging over all $X_i \rho X_i^\dagger$ gives the maximally mixed state, which has the largest possible
entropy in $d$ dimensions), and the second term is minimized. The same holds for the channel
$N_2'$. Now, suppose there is some state $|w\rangle\langle w|$ which has smaller output entropy for the
channel $N_1 \otimes N_2$ than $H(N_1(|v_1\rangle\langle v_1|)) + H(N_2(|v_2\rangle\langle v_2|))$. We can use the ensemble containing
states $|w\rangle\langle w| \otimes |i_1, i_2\rangle\langle i_1, i_2|$, for $i_1, i_2 = 0, \ldots, d^2 - 1$, with equal probabilities, to obtain a
larger capacity for the tensor product channel $N_1' \otimes N_2'$.

$^6$ This is not necessary for the proof, but it reduces the number of subscripts required to express it.
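The averaging claim used above, that the uniform mixture of $X_i \rho X_i^\dagger$ over all $d^2$ generalized Pauli matrices is the maximally mixed state, can be checked numerically. The following is a small sketch for $d = 3$ using plain nested lists; the helper names and the test state are hypothetical.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[c][r].conjugate() for c in range(n)] for r in range(n)]

def identity(d):
    return [[1 + 0j if r == c else 0j for c in range(d)] for r in range(d)]

def shift(d):
    """T takes |j> to |j+1 mod d>, so <r|T|c> = 1 when r = c+1 mod d."""
    return [[1 + 0j if r == (c + 1) % d else 0j for c in range(d)] for r in range(d)]

def phase(d):
    """R takes |j> to exp(2*pi*i*j/d)|j>."""
    return [[cmath.exp(2j * cmath.pi * c / d) if r == c else 0j for c in range(d)]
            for r in range(d)]

def generalized_paulis(d):
    """Return [X_0, ..., X_{d^2-1}] with X_{d*a+b} = T^a R^b."""
    t, r = shift(d), phase(d)
    paulis, t_pow = [], identity(d)
    for _a in range(d):
        x = t_pow                      # T^a R^0
        for _b in range(d):
            paulis.append(x)
            x = matmul(x, r)           # multiply by one more factor of R
        t_pow = matmul(t_pow, t)
    return paulis

def twirl(rho):
    """Uniform average of X rho X^dagger over all generalized Paulis."""
    d = len(rho)
    paulis = generalized_paulis(d)
    total = [[0j] * d for _ in range(d)]
    for x in paulis:
        term = matmul(matmul(x, rho), dagger(x))
        total = [[total[r][c] + term[r][c] / len(paulis) for c in range(d)]
                 for r in range(d)]
    return total

# Hypothetical test state: the pure state (|0> + |1>)/sqrt(2) in d = 3.
rho = [[0.5 + 0j, 0.5 + 0j, 0j], [0.5 + 0j, 0.5 + 0j, 0j], [0j, 0j, 0j]]
averaged = twirl(rho)   # expected to be close to I/3
```

Because the average output is maximally mixed regardless of the input state, the first term of the Holevo expression is at its maximum, $\log d$, for any ensemble of this form.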

The above argument works equally well to show that additivity of entanglement of
formation implies additivity of minimum entropy output. We know that to achieve the
maximum capacity, the average output state must be the maximally mixed state, so we can equally
well use the fact that the constrained Holevo capacity $\chi_N(\rho)$ is additive to show that the
minimum entropy output is additive.

10 Discussion 

We have shown that four open additivity questions are equivalent. This makes these ques- 
tions of even greater interest to quantum information theorists. Unfortunately, our techniques 
do not appear to be powerful enough to resolve these questions. 

The relative difficulty of the proofs of the implications given in this paper would seem 
to imply that of these equivalent conjectures, additivity of minimum entropy output is in 
some sense the "easiest" and strong superadditivity of $E_F$ is in some sense the "hardest."
One might thus try to prove additivity of the minimum entropy output as a means of solving 
all of these equivalent conjectures. One step towards solving this problem might be a proof 
that the tensor product of states producing locally minimum output entropy gives a local 
minimum of output entropy in the tensor product channel. 